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We pursue the development and apphcation ol the recently-introduced linear optimization method 
for determining the optimal linear and nonlinear parameters of Jastrow-Slater wave functions in a 
variational Monte Carlo framework. In this approach, the optimal parameters are found iteratively 
by diagonalizing the Hamiltonian matrix in the space spanned by the wave function and its first-order 
derivatives, making use of a strong zero-variance principle. We extend the method to optimize the 
exponents of the basis functions, simultaneously with all the other parameters, namely the Jastrow, 
configuration state function and orbital parameters. We show that the linear optimization method 
can be thought of as a so-called augmented Hessian approach, which helps explain the robustness 
of the method and permits us to extend it to minimize a linear combination of the energy and the 
energy variance. We apply the linear optimization method to obtain the complete ground-state 
potential energy curve of the C2 molecule up to the dissociation limit, and discuss size consistency 
and broken spin-symmetry issues in quantum Monte Carlo calculations. We perform calculations 
of the first-row atoms and homonuclear diatomic molecules with fully optimized Jastrow-Slater 
wave functions, and we demonstrate that molecular well depths can be obtained with near chemical 
accuracy quite systematically at the diffusion Monte Carlo level for these systems. 
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I. INTRODUCTION 

Quantum Monte Carlo (QMC) methods (see e.g. 
Rcfs. 1-3) constitute an alternative to standard quan- 
tum chemistry approaches for accurate calculations of 
the electronic structure of atoms, molecules and solids. 
The two most commonly used variants, variational Monte 
Carlo (VMC) and diffusion Monte Carlo (DMC), use a 
flexible trial wave function, generally consisting for atoms 
and molecules of a Jastrow factor multiplied by a short 
expansion in configuration state functions (CSFs), each 
consisting of a linear combination of Slater determinants 
of orbitals expanded in a localized one-particle basis. To 
fully benefit from the considerable flexibility in the form 
of the wave function, it is crucial to be able to efflciently 
optimize all the parameters in these wave functions. 

In recent years, a lot of effort has been devoted to de- 
veloping efflcient methods for optimizing a large number 
of parameters in QMC wave functions. One the most ef- 
fective approaches is the linear optimization method of 
Refs. 4 and 5. This is an extension of the zero-variance 
generalized eigenvalue equation approach of Nightingale 
and Melik-Alaverdian [6] to arbitrary nonlinear param- 
eters, and it permits a very efficient and robust energy 
minimization in a VMC framework. This method was 
applied successfully to the simultaneous optimization of 
Jastrow, CSF and orbital parameters of Jastrow-Slater 
wave functions for some all-electron atoms and molecules 
in Refs. 4 and 7, and for the C2 and Si2 molecules with 



pseudopotentials in Ref. 5. It has also been applied in 
Ref. 8 for optimizing Jastrow, CSF and backflow param- 
eters to obtain very accurate wave functions for the first- 
row atoms. 

In this paper, we pursue the development and applica- 
tion of the linear optimization method. We extend the 
method to optimize the exponents of the basis functions, 
simultaneously with all the other parameters, a capacity 
rarely available in standard quantum chemistry methods 
(see, however, Refs. 9-11). This uses a slight generaliza- 
tion of the parametrization employed for Jastrow-Slater 
wave functions to allow nonorthogonal orbitals. Also, we 
show that the linear optimization method can be thought 
of as a so-called augmented Hessian approach. This al- 
lows us to clearly establish the connection between the 
linear optimization method and a stabilized Newton op- 
timization method, helping to explain the robustness of 
the approach. Moreover, this formulation permits us to 
extend the method to minimize a linear combination of 
the energy and the energy variance, as done in Ref. 12 
with the Newton method. We then apply the linear op- 
timization method to obtain the complete ground-state 
potential energy curve of the C2 molecule up to the dis- 
sociation limit, and discuss size consistency and broken 
spin-symmetry issues in QMC calculations. Finally, we 
perform calculations of the first-row atoms and homonu- 
clear diatomic molecules with fully optimized Jastrow- 
Slater wave functions, and we demonstrate that molec- 
ular well depths can be obtained with near chemical ac- 
curacy quite systematically at the DMC level for these 
systems. 
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When not explicitly indicated, Hartree atomic units 
are assumed throughout this work. 
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II. WAVE FUNCTION OPTIMIZATION 

A. Wave function parametrization 

To optimize a large number of parameters of a wave 
function, it is important to use a convenient and efficient 
parametrization that eliminates redundancies. We use an 
iV-electron Jastrow- Slater wave function parametrized as 
(see Ref. 4) 

|*(p))=J(a)e^(-«) J2 cilCjiO), (1) 
1=1 

where J(a) is a Jastrow operator, e"^'*'''-' is an orbital 
transformation operator, and |C/(C)) are configuration 
state functions (CSFs). The parameters to be optimized, 
collectively designated by p = {a.,c, k,C), are the Jas- 
trow parameters a, the CSF parameters c, the orbital 
rotation parameters k and the basis exponent parame- 
ters C. 

We use a flexible Jastrow factor consisting of the 
exponential of the sum of electron- nucleus, electron- 
electron and electron-electron- nucleus terms, written as 
systematic polynomial and Fade expansions [13] (see also 
Refs. 14, 15). Each CSF is a short linear combination of 
products of spin-up and spin-down Slater determinants 
\DliC)) and \DliC)) 

|C/(C)>=E^^.k|i^i(C)>l^,^(C)>, (2) 

k 

where the coefficients d/ k are fully determined by 
the spatial and spin symmetries of the state con- 
sidered. The A'^i-electron and A'^j -electron Slater 
determinants are generated from the set of non- 
necessarily orthonormal orbitals at the current optimiza- 
tion step, \DiiC)) - al\^ iC)&Z^ iO ■ ■ ■ al^^iC)\y^c) 

and \DiiC)) = «°4,u(0<,.i(C)---«r„i(C)|vac), 

where a^^^(^) (with a =|, J,) is the fermionic creation op- 
erator for the orbital |0"(C)) (the superscript referring 
to the current optimization step) in the spin-cr determi- 
nant, and |vac) is the vacuum state of second quantiza- 
tion. The (occupied and virtual) orbitals are written as 
linear combinations of A^bas basis functions |Xai(Cp)) with 
current coefficients A^. ^ 

I0?(c)) = E ^".mIxm(Cm)>- (3) 

Specifically, in this work, we use Slater basis functions 
whose expression in position representation, using spher- 
ical coordinates r = (r, 0, 0) around an atom position r^, 
is 



where A^„(C) = ^(2C)2"+V(2n!) is the radial normal- 
ization constant and Si^rni(^,4') arc the normalized real 
spherical harmonics. 

Some parameters in the Jastrow factor are fixed by 
imposing cusp conditions [16] on the wave function; the 
other Jastrow parameters are varied freely. Due to the ar- 
bitrariness of the overall normalization of the wave func- 
tion, only A^csF — 1 CSF coefficients need be varied, e.g. 
the coefficient of the first CSF is kept fixed. The only re- 
strictions that we impose on the exponents are the equal- 
ity of the exponents of the basis functions composing the 
21 + 1 components of spherical harmonics having the same 
I (e.g., Px, Py and Pz) and, of course, the equality of 
the exponents of symmetry-equivalent basis functions on 
equivalent atoms. Finally, the parametrization of the or- 
bital cocfhcients through the orbital rotation parameters 
K, to which we shall come next, allows one to conve- 
niently eliminate the redundancies due to the invariance 
properties of determinants under elementary row opera- 
tions. 

Because straightforward variation of the exponents ^ of 
the basis functions results in orbitals being nonorthogo- 
nal, in this work we use an orbital optimization formalism 
that applies to nonorthogonal orbitals. The general idea 
of this formalism appears also in valence bond theory (see 
Rcfs. 17 -19) and is a direct generalization of the formal- 
ism for optimizing orthonormal orbitals used in standard 
multiconfiguration self-consistent field (MCSCF) theory 
(see, e.g., Ref. 20), and, in a QMC context in Ref. 4. 
At each optimization step, the orbitals are transformed 
using the operator e"'*^' where k(k, C) is the total real 
singlet orbital excitation operator 

kiKX)=Y.^kiEki{C), (5) 

where the parameters Kki are nonzero only for nonredun- 
dant orbital pairs (see below) and Eki{C) is the singlet 
excitation operator from orbital I to orbital k 

^«(C) -a°t(C)&?T(C) + «°l(0^?i(C), (6) 

where 6^^ = J2q(^^^)iq^qa is the dual orbital annihi- 
lation operator at the current optimization step written 
in terms of the usual annihilation operators a^^ and the 
inverse of the overlap matrix O of the orbitals with ele- 
ments Oiq = The operators aj.^ and satisfy 
the canonical anti-commutation relations {a^j^, O/'J/} = 0, 
{HaM.'} = and {ali,bl,} = SkiSaa'- The action of 
the operator Eki{C) on a spin-cr Slater determinant is 
simply to replace the I spin-cr orbital by the k spin-cr 
orbital. Thus, in practice, the calculation of the orbital 
overlap matrix O is not needed. In contrast to the case 
of orthonormal orbitals, the operator k(/«, ^) is not anti- 
Hermitian and thus the operator e"^*^' ^' is not unitary. 
The action of this operator on a Slater determinant is 
seen by inserting e"'^^''' ^^e'^^''' = 1 after each orbital 
creation operator making up the determinant and using 
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Ojvac) = |vac); this leads to a new Slater determi- 
nant made with the transformed orbital creation opera- 
tors 

aLKC) = e^('^^«)a°:(C)e-'^('^'<), (7) 
and, accordingly, the corresponding transformed orbitals 



(8) 



where the sum is over all (occupied and virtual) orbitals, 
and {e'^)ik are the elements of the transformation ma- 
trix e** constructed as the exponential of the matrix k 
with elements n^i- The nonredundant orbital excita- 
tions / k have already been discussed in Ref. 4. For 
a single-determinant wave function, the nonredundant 
excitations are: closed — *■ open, closed virtual and 
open — > virtual. For a multi-configuration complete ac- 
tive space (CAS) [21] wave function, the nonredundant 
excitations are: inactive — > active, inactive — > secondary 
and active — > secondary. For both single-determinant 
and multi-determinant CAS wave functions, ii I k is 
an allowed excitation in Eq. (5) then the action of the 
reverse excitation fc — > Z is zero, and one can choose to 
impose the condition k/^ = —ku- In this case, /« is a 
real anti-symmetric matrix and thus e'' is an orthogonal 
matrix that simply rotates the orbitals. For a general 
multiconfiguration wave function (not CAS), some ac- 
tive — !■ active excitations must also be included and the 
action of the corresponding reverse excitation is gener- 
ally not zero. If the reverse excitation is independent, 
one does not have to enforce the orthogonality condition 
Kik = —Kki, in which case the transformation of Eq. (8) 
is no longer a rotation. Of course, in addition to these re- 
strictions, only excitations between orbitals of the same 
spatial symmetry have to be considered. We note that 
this orbital optimization formalism is well suited for the 
use of localized orbitals, although we do not explore this 
possibility in this work. 

We note that it is also possible to keep the orbitals ex- 
actly orthonormal when varying the basis exponents (see 
Refs. 9, 10). If one starts from orthonormal orbitals and 
uses basis functions that are for instance symmetrically 
orthonormalized [22, 23] 



(9) 



where B is the overlap matrix of the basis functions with 
elements ~ {xiy\Xti)j then the orthonormality of the 
orbitals is preserved during the optimization. However, 
this complicates the calculation of the derivatives of the 
wave function with respect to the exponent parameters 
(in particular, one needs to calculate the derivatives of 
the matrix B~^/^ as done in Ref. 24), and the computa- 
tional effort of the optimization is significantly increased. 
In practice, as discussed in Section IV, wc have found 



that using orthogonalized basis functions does not signif- 
icantly reduce the number of iterations needed to reach 
convergence. 

We denote by Np the total number of parameters to be 
optimized. The parameters at the current optimization 
step are denoted by = (q:",c",/«" = 0,^°) and the 
corresponding current wave function by 



|*o) = |*(p°))=J(aO)5^c?|Cz(0). 

1=1 



(10) 



B. First-order derivatives of the wave function 

We now give the expressions for the first-order deriva- 
tives of the wave function |5'(p)) of Eq. (1) with respect 
to the parameters at p = p° 



I*,:) = 



/ p=pO 



(11) 



which collectively designate the derivatives with respect 
to the Jastrow parameters 



1*..) = ^^ E'c?|c^/(0), 



(12) 



1=1 



with respect to the CSF parameters 

\^,,) = J(a°)|C,(C'')), (13) 
with respect to the orbital parameters 



Ncs 



= J(««) E c?^,KC°)|CKC°)), (14) 



1=1 



and with respect to the exponent parameters 



I* 



Cm/ 



1=1 



(15) 



The derivatives with respect to the orbital parameters in 
Eq. (14) are thus simply generated by single excitations 
of orbitals out of the CSFs. In the derivatives with re- 
spect to the exponent parameters in Eq. (15), the orbital 
transformation operator e"^'^'''-' in Eq. 1 docs not con- 
tribute since the orbitals are transformed at each step so 
that we always have kP ^ 0. 



C. Linear optimization method 

We use the linear optimization method of Refs. 4 
and 5 to optimize all the parameters in our wave func- 
tions. This is an extension of the zero- variance gener- 
alized eigenvalue equation approach of Nightingale and 
Melik-Alaverdian [6] to arbitrary nonlinear parameters, 
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and it permits a very robust and efficient energy min- 
imization in a VMC context. We review here this ap- 
proach from a somewhat different perspective and show 
how the method can be extended to minimize a linear 
combination of the energy and the energy variance. 

At each step of the optimization, the quantum- 
mechanical averages arc computed by sampling the prob- 
ability density of the current wave function \E'o(R) = 
(Rj^'o) in the A^-clectron position representation |R) ~ 
|ri, r2, • • • , rjv). We will denote the statistical average of 

a local quantity, /(R), by (/(R)) = (l/A/)Efcli /(Rfe) 
with M electron configurations R^.. 



1. Minimization of the energy 

Deterministic optimization method. The idea of the 
method is to iteratively: 

(i) expand the normalized wave fmiction |5'(p)) = 
l^(p)) / V (^(p)l'J'(p)) to first order in the parameter 
variations Ap = p — p*^ around the current parameters 



l^lin(p)) =|Wo)+£Ap, 
i=l 



(16) 



where |^'o) ~ |5'(p*')) is the normalized current wave 
function, and 



1 



v/(*o|*o) 



(17) 



are the first-order derivatives of the normalized wave 
function |vl/(p)) with respect to the parameters at p°, 
written in terms of the first-order derivatives j'l'j) of the 
unnormalized wave function |5'(p)) given in Section II B; 
(ii) minimize the expectation value of the Hamiltonian 
H over this linear wave function with respect to the pa- 
rameter variations Ap 



Eiin = min 



(*li„(p)|i7|vl/un(p)) 



(18) 



Ap (*ii„(p)|vl/iin(p)) 

(iii) update the current parameters as p° — > p° 

The energy minimization step (ii) can be written in 
matrix notation as 



Ap. 



1 Ap^ 



E'lin = min ■ 

Ap 



g/2 H 



1 

Ap 



1 ApT) 



1 0^ 
s 



1 

Ap 



(19) 



where Eq = (^'o|-ff|^'o) is the current energy, g is the 
gradient of the energy with respect to the Np parameters 
with components gi = 2(4'i|if|^'o), H is the Hamiltonian 
matrix in the basis consisting of the Np wave function 
derivatives with elements Hij = {'^i\H\'^j), and S is the 



overlap matrix in this basis with elements 5*^ 



Clearly, the minimization of Eq. (19) is equivalent to solv- 
ing the following {Np + l)-dimensional generalized eigen- 
value equation 



Eo \ ( 1 

g/2 H ; V 



E^ 



lin 



1 oj 
s 



1 

Ap 



(20) 



for its lowest eigenvector. Each optimization step thus 
consists of a standard Rayleigh-Ritz approach. When 
applied to the specific case of the orbital rotation pa- 
rameters, this linear optimization method is known in 
the quantum chemistry literature as the super configu- 
ration interaction or generalized Brillouin theorem ap- 
proach [25-28] . It can also be seen as an instance of the 
class of augmented Hessian optimization methods with a 
particular choice of the matrix H, sometimes also called 
rational function optimization methods, which are based 
on a rational quadratic model of the function to optimize 
at each step rather than a quadratic one, and which are 
known to be very powerful for optimizing MCSCF wave 
functions and molecular geometries (see, e.g., Refs. 29- 
38). 

We note that after solving Eq. (20), and before updat- 
ing the current parameters, the parameter variations Ap 
can be advantageously transformed according to Eq. (32) 
of Ref. 4 which corresponds to changing the normaliza- 
tion of the wave function |5'(p)) [69]. 

Implementation in variational Monte Carlo. Nightin- 
gale and Melik-Alaverdian [6] showed how to realize ef- 
ficiently this energy minimization approach on a finite 
Monte Carlo (MC) sample, which is not as obvious as it 
may seem. The procedure makes use of a strong zero- 
variance principle and can be described as follows. If the 
current wave function and its first-order derivatives with 
respect to the parameters {[^E'o), l^*!), ■ ■ ■ jI^'at^)} form 
a complete basis of the Hilbert space considered (or, less 
stringently, span an invariant subspace of the Hamilto- 
nian H), then there exist optimal parameters variations 
Ap so that the linear wave function of Eq. (16) is an 
exact eigenstatc of H, and therefore satisfies the local 
Schrodinger equation for any electron configuration R^ 



li i?lin |vI'o)+^Ap,|vI', 



(21) 



where the Apj 's are independent of Rfc. Multiplying this 
equation by \E'i(Rfc)/5'o(Rfe)^ (where i = 0, 1, Np) and 
averaging over the M points R^ sampled from |^'o(R)|^ 
leads to the following stochastic version of the generalized 
eigenvalue equation of Eq. (20) 



Ml 



Eo ^^§^2 



1 
Ap 



= E, 



lin 



1 

^^s 



1 

Ap 



(22) 



5 



whose lowest eigenvector solution gives the desired op- 
timal parameter variations Ap independently of the MC 
sample, i.e. with zero variance. In Eq. (22). Eq = 
(£'l(R')) is the average of the local energy Ei^CR) = 
(R|i7|^'o)/^'o(R) (the superscript denoting the de- 
pendence on the MC sample), ^gL and ^^g_R are two 
estimates of the energy gradient with components 



Now, in practice for nontrivial problems, the basis 
{l^'o), l^*!), • ■ • ,|*Afp)} is never complete, and conse- 
quently solving Eq. (22) actually gives an eigenvector 
solution ^^Ap and associated eigenvalue ^^E'lin that do 
depend on the MC sample. But this solution is not the 
solution that would be obtained by naively minimizing 
the energy of the MC sample 



9L,i 



^,(R) (R|g|^o) 
¥o(R) ^o(R) 

/*»(R) 



,*o(R 



*o(R) 



(23) 



where ^',(R)/*o(R) = ^'.(R)/*o(R) - (*,(R)/*o(R)) 
has been used, 



9R,j 



*o(R 
*j(R) 



,*o(R) 



^.(R) 
*o(R) 



{E^m 



+ {EL,,m 



(24) 



where Slj(R) = (R|i7|1'j)/*o(R) - 

[«'j(R)/*o(R)]£'l(R) is the the derivative of the 
local energy with respect to the parameter pj (which 
is zero in the limit of an infinite sample), is the 

following nonsymmetric estimate of the Hamiltonian 
matrix 



M 



Hi 



^,(R) (R^Hl^,) \ 
vI/o(R) vl/o(R) / 

*o(R) *o(R) "^^^ 

*,;(R)\ /*j(R) 



+ 



^o(R)/ 


\*o(R) 




/*KR) 


*o(R)/ 


\*o(R) 


*^(R)\ 


/*.(R) 



*o(R) 
'^'^(R) 
«'o(R) 



/ \*o(R) 
El.jCR) 



i?L(R) 

Ei^iR) 
{ELiR)) 

^»(R) 
*o(R) 



(Eum), 

(25) 



and S is the estimated overlap matrix 



AT 



^.(R)^,-(R) 

*o(R) *o(R) 
^,(R) ^,(R) 
*o(R) *o(R) 



vl/,(R)\ /*,(R) 



*o(R)/ \*o(R 



(26) 



(1 ApT 



M 



Eun ^ min ■ 

Ap 



M p 



J\/„T 



1 

Ap 



1 Ap^ 



1 

Ap 



(27) 



which would yield instead a generalized eigenvalue equa- 
tion similar to Eq. (22) but with symmetrized analogues 
of the energy gradients ^^gi, ^^g/? and the Hamiltonian 
matrix ''^^H. In fact, solving the generalized eigenvalue 
equation of Eq. (22) leads to parameter variations with 
statistical fluctuations about one or two orders of mag- 
nitude smaller than the parameter fluctuations obtained 
using the symmetrized eigenvalue equation resulting from 
the minimization of Eq. (27). Of course, in the limit of 
an infinite sample il/ — > oo, the generalized eigenvalue 
equation of Eq. (22) and the minimization of Eq. (27) 
become equivalent. 



2. Minimization of a linear combination of the energy and 
the energy variance 

In some cases, it is desirable to minimize a linear com- 
bination of the energy E and the energy variance V: 
(1 — q)E + qV where < g < 1. For instance, it has been 
shown that mixing in a small fraction of the energy vari- 
ance (e.g., q = 0.05) in the optimization can significantly 
decrease the variance while sacrificing almost nothing of 
the energy [12]. Also, there is a theoretical argument 
suggesting that if one wants to minimize the number of 
MC samples needed to obtain a fixed statistical uncer- 
tainty on the average energy in DMC, then one should 
minimize in VMC a linear combination of the energy and 
the energy variance (usually, the energy dominates by far 
in this linear combination) [39, 40]. Indeed, in practice, 
the number of MC samples needed in DMC is often re- 
duced by a few percent by using q = 0.05 rather then 
q = [12]. Finally, the energy variance may be more 
sensitive to some parameters than the energy. 

The formulation of the linear optimization method as 
an augmented Hessian approach shows clearly how to 
introduce minimization of the energy variance in the 
method. Suppose that, at each optimization step, we 
have some quadratic model of the energy variance to min- 
imize 



Klin = min 

Ap 



K+gy -Ap + ^Ap^ 



hy • Ap ^ (28) 
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where Vq = {'i'o\{H — £"0)^1^0) is the energy variance 
of the current wave function |4'o), gy is the gradient of 
the energy variance with components gv,i = 2(5'; |(H — 
£'o)^|5'o) and hy is some approximation to the Hessian 
matrix of the energy variance. Then, alternatively, one 
could minimize the following rational quadratic model 



(1 ApT 



Vo _ 
gv/2 hy/2 + yoS 



1 

Ap 



Ap 



1 Ap^ 



1 qT 
s 



1 

Ap 



(29) 



which agrees with the quadratic model in Eq. (28) up 
to second order in Ap, and which leads to the following 
generalized eigenvalue equation 



Vo 
gy/2 



g^/2 _ 
hy/2 + yoS 



1 

Ap 



s 



1 

Ap 



On a finite MC sample, the overlap matrix S is estimated 
as before by the expression given in Eq. (26), the energy 
variance is estimated as ^Vb = (Ei^iRy) - (^^L(R))^ 
its gradient is calculated as [12] 



M , 



9v,i 



(-Bl..(R)-Bl(R)) - (-Bl,»(R)) (£^l(R)) 



*.(R) 



Sl(R)' 



*o(R) 
^5L.. (i?L(R)) 



^.(R) 
*o(R) 



(i^L(R)^) 



(31) 



and its Hessian can be approximated by the (positive- 
definite) Levenberg-Marquardt approximation [12] 



Ml 



iy,y 



(i?L,KR)i^L,,(R)) - "9L,^ (i?L,,(R)) 



M 



M M 
9L,i gL,3 



(32) 



We have also tested the use of the exact Hessian of the en- 
ergy variance of the linear wave function of Eq. (16), but 
this Hessian containing a quadricovariance term tends to 
be more noisy than the simpler Hessian of Eq. (32), is 
not guaranteed to be positive definite, and leads to a less 
efficient optimization. 

It is clear that an arbitrary linear combination of the 
energy and the energy variance can be minimized by com- 
bining the augmented Hessian matrix of the energy on 
the left-hand-side of Eq. (22) with the augmented Hes- 
sian matrix of the energy variance on the left-hand-side 
of Eq. (30) with the estimators of Eqs. (31) and (32). 
We note however that this procedure destroys the zero- 
variance principle described in the previous section which 



holds if only the energy is minimized. In practice, intro- 
ducing only a small fraction (~ 5%) of the energy vari- 
ance does not adversely affect the benefit gained from the 
zero- variance principle, and in fact in most cases makes 
the optimization converge more rapidly. We note that 
if we were to undertake the additional computational ef- 
fort of computing (R|if^|5'i) then it is possible to for- 
mulate an optimization method that obeys the strong 
zero-variance principle even when optimizing the energy 
variance or a linear combination of the energy and the 
energy variance. 

In fact, following this procedure, any penalty function 
imposing some constraint, for which we have estimates of 
the gradient and of some approximation to the Hessian, 
can be added to the energy and optimized with the linear 
method, hopefully without spoiling very much the benefit 
of the zero-variance principle for the energy. 



(30) 



3. Robustness of the optimization 



The linear optimization method has been found to be 
somewhat more robust than the Newton optimization 
method of Rcf. 41 using an approximate Hessian, and 
even of that of Ref. 12 using the exact Hessian, for op- 
timizing QMC wave functions. On a finite MC sample, 
when minimizing the energy, the zero-variance principle 
of Nightingale and Melik-Alaverdian [6] is certainly a ma- 
jor ingredient in the robustness of the method. But even 
in the limit of an infinite MC sample, augmented Hes- 
sian approaches are known in the quantum chemistry 
literature to be more robust than the simple Newton 
method. In fact, it has been shown that augmented Hes- 
sian methods can have a greater radius of convergence 
than the Newton method [32]. This can be understood 
by rewriting the generalized eigenvalue equation (20) as 
(see Refs. 32, 34) 



(h-2A£;S) -Ap^-g, 



2AE = gT . Ap, 



(33a) 



(33b) 



where h = 2(H — i?oS) is a A^p-dimensional matrix and 
AE = E'lin — i?o < is the energy stabilization obtained 
from going from the current wave function \'^o) to the 
linear wave function l^'un) (which is necessarily nega- 
tive within statistical noise). Equations (33) show that 
the linear optimization method is equivalent to a New- 
ton method with an approximate Hessian matrix h which 
is level-shifted by the positive-definite matrix — 2A£'S, 
which acts as a stabilizer. This method is thus closely 
related to the stabilized, approximate Newton method 
of Ref. 41. The advantage of the present approach, be- 
side the previously discussed zero- variance principle, is 
that the stabilization constant — 2A_E is automatically 
determined from the solution of the generalized eigen- 
value equation. Clearly, from Eq. (33b), —2AE tends 
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tends to be large far from the minimum, and tends to 
zero at convergence. 

In some cases, when the initial parameters are very bad 
or the MC sample is not large enough, it is necessary to 
further stabilize the optimization. A variety of differ- 
ent stabilization schemes are conceivable. In practice, we 
have found that adding a positive constant Odiag to the 
diagonal of the Hamiltonian matrix, i.e. H ^ H + Odiagl 
where I is the identity matrix, works well. The value of 
fldiag is adjusted at each optimization step by performing 
three very short MC calculations using correlated sam- 
pling with wave function parameters obtained with three 
trial values of Cdiag and predicting by parabolic inter- 
polation the value of Odiag that minimizes the energy, 
with some bounds imposed. In addition, Odiag is forced 
to increase if the norm of linear wave function variation 



tions 



or the norm of the parameter varia- 



Ap|| exceed some chosen thresholds, or if some 
parameters exit their allowed domain of variation (e.g., 
if a basis exponent becomes negative). 



III. COMPUTATIONAL DETAILS 

We now give some details of the calculations that we 
have performed on the first-row atoms and homonuclear 
diatomic molecules in their ground states. 

We start by generating a standard ab initio 
wave function using the quantum chemistry program 
GAMESS [42], typically a restricted Hartree-Fock (RHF) 
wave function or a MCSCF wave function with a com- 
plete active space generated by distributing n valence 
electrons in m valence orbitals [CAS(n,TO)]. We use the 
CVBl Slater basis of Ema et al. [43], each Slater func- 
tion being actually approximated by a fit to 14 Gaussian 
fimctions [44-46] in GAMESS. 

This standard ab initio wave function is then multi- 
plied by a Jastrow factor, imposing the electron-electron 
cusp condition, but with essentially all other free param- 
eters chosen to be initially zero to form our starting trial 
Jastrow-Slater wave function, and QMC calculations are 
performed with the program CHAMP [47] using the true 
Slater basis set rather than its Gaussian expansion. The 
Jastrow, CSF, orbital and exponents parameters are si- 
multaneously optimized with the linear energy minimiza- 
tion method in VMC, using an accelerated Metropolis 
algorithm [48, 49]. We usually start with 10000 MC 
iterations for the first optimization step and then this 
number is progressively increased at each step (typically 
by a factor 1.5 to 4) during the optimization until the 
energy is converged to 10""* (for the lighter systems) or 
10~^ (for the heavier systems) Hartree within a statisti- 
cal accuracy of 5.10"^ or 5.10"'*, respectively. The opti- 
mization typically converges in less than 10 steps. Once 
the trial wave function has been optimized, we perform 
a DMC calculation within the short-time and fixed-node 
(FN) approximations (see, e.g., Refs. 50-54). We use an 
imaginary time step of usually r = 0.01 Hartree"^ in an 
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FIG. 1: Convergence of the VMC total energy of the all- 
electron C2 molecule during the optimization of the 12 ex- 
ponent parameters in a wave function composed of a single 
Slater determinant multiplied by a Jastrow factor. Crude ini- 
tial exponents have been intentionally chosen as the integers 
nearest to the exponent values of the CVBl basis [43]. In this 
calculation, the number of MC sample was initially 10000 and 
this number was progressively increased at each iteration until 
a final statistical uncertainty of 0.5 mHartree was reached. 



efficient DMC algorithm featuring very small time-step 
errors [55]. For Ne and Ne2 we computed the DMC en- 
ergies at four time steps, 0.020, 0.015, 0.01, and 0.005 
Hartree"^, and performed an extrapolation to zero time 
step. 

We found it convenient to start from the CVBl basis 
exponents and from CSF and orbitals coefficients gener- 
ated by GAMESS, but in fact the ability to optimize all 
these parameters in QMC allows us to also start from 
cruder starting parameters without relying on a exter- 
nal quantum chemistry program. In this larger 
number of optimization iterations are needed to achieve 
convergence. 



IV. OPTIMIZATION OF THE BASIS 
FUNCTION EXPONENTS 

In Ref. 4, the optimization of the Jastrow, CSF and or- 
bital parameters has been discussed in detail. We com- 
plete here the discussion with the optimization of the 
exponent parameters. 

Figure 1 shows the convergence of the VMC total en- 
ergy of the all-electron C2 molecule during the optimiza- 
tion of the 12 exponent parameters in a wave function 
composed of a Jastrow factor multiplied by a single Slater 
determinant where the Jastrow and orbital parameters 
have been previously optimized (with the exponents fixed 
at the CVBl values). Crude initial exponents have been 
intentionally chosen as the integers nearest to the expo- 
nent values of the CVBl basis. One sees that the linear 
energy minimization method yields a fast convergence of 
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the energy in about three iterations, typically as fast as 
when optimizing the other parameters. The simultaneous 
optimization of the Jastrow, CSF, orbital and exponent 
parameters generally converges as fast as the simultane- 
ous optimization of only the Jastrow, CSF and orbital 
parameters reported in Rcf. 4. 

When optimizing the exponents without simultaneous 
optimization of the orbitals, we have found the optimiza- 
tion very stable, the introduction of the stabilization con- 
stant Odiag often being unnecessary. However, when opti- 
mizing simultaneously the orbital and exponents param- 
eters, the optimization tends to be less stable because of 
near redundancies between these two sets of parameters, 
and Cdiag typically increases up to 10~^ — 10"'* to retain 
stability. 

Tests on a few atoms have shown that, for the opti- 
mization of exponents only, the use of the orthonormal- 
ized basis functions of Eq. (9) tends to be a bit more 
stable. Typically, the overlap matrix S of the wave func- 
tion derivatives have eigenvalues that span about 7 or- 
ders of magnitude from 1 to 10~^ for (unnormalized or 
normalized) nonorthogonalized functions whereas, for or- 
thonormalized basis functions, the eigenvalues span only 
about 4 orders of magnitude from 1 to 10~^. Thus, or- 
thogonalization of the basis functions reduces the range 
of the eigenvalues, attenuating near redundancies among 
the exponent parameters. However, the wave function 
derivatives with respect to the exponents take signifi- 
cantly longer to compute when using orthonormalized 
basis functions, and in addition, when also optimizing 
the orbitals, the near redundancies between some orbital 
and exponents parameters make adiag increase anyway. 
Thus, we have not found it worthwhile for our purpose 
to use orthonormalized basis functions. 

For the first-row atoms, optimizing the exponents (si- 
multaneously with the other parameters) rather than us- 
ing the exponents of the CVBl basis typically yields im- 
provements of the total VMC energies between 0.1 and 1 
mHartree, which is at the edge of statistical significance 
and accuracy of the optimization. Thus, at the accuracy 
that we are concerned with, the exponents of the CVBl 
basis for these atoms arc nearly optimal for Jastrow- 
Slater wave functions. As expected, larger improvements 
are obtained for the first-row diatomic molecules. The 
largest improvement of the total energy is observed for 
the O2 molecule with a gain of about 3.4 mHartree in 
VMC and 1.2 mHartree in DMC with a Jastrow-Slater 
single-determinant wave function. The largest improve- 
ment of the standard deviation of the energy is ob- 
tained for the Be2 dinicr using a Jastrow-Slater single- 
determinant wave function, with a gain of 0.25 Hartree. 
(The standard deviation of the energy is ct = 0.57 Hartree 
with the CVBl exponents and a = 0.32 Hartree with 
the reoptimized exponents using pure energy minimiza- 
tion. A mixed minimization with an energy weighting 
of 0.95 and a variance weighting of 0.05 and reoptimized 
exponents results in the same energy but a cr of 0.32, 
whereas a pure variance minimization yields a variational 



energy that is higher by 1 mHartree and a cr of only 0.24.) 
Properties other than the energy might be more sensitive 
to the basis exponents. We note that the optimization 
method that we are using is designed to find local min- 
ima, and we cannot be sure that we have found the global 
minimum for the form of the trial wave function consid- 
ered. In particular, optimization of the exponent param- 
eters typically leads to multiple local minima. We have 
found that by optimizing the exponents it is possible to 
reduce the size of the basis, without sacrificing the en- 
ergy or the energy variance, but the results in this paper 
were all obtained using a basis size corresponding to the 
CVBl basis. A smaller basis has also the advantage of 
having fewer local minima. 

As noted in Ref. 10, because the virial theorem within 
the Born-Oppcnheimer approximation at the equilibrium 
nuclear geometry holds if the energy is stationary with 
respect to scaling of the electron coordinates, optimiza- 
tion of the basis exponents along with optimization of 
the scaling factors of the interelectron coordinates in the 
Jastrow factor permits one to satisfy exactly the virial 
theorem in VMC in the limit of infinite sample size. 



V. POTENTIAL ENERGY CURVE OF THE C2 
MOLECULE 

In Fig. 2, wc show the total energy curve of the 
all-electron C2 molecule as a function of the inter- 
atomic distance R calculated in (plot a) RHF, VMC 
and DMC with a fully optimized Jastrow x single- 
determinant wave function [JxSD], and (plot b) MC- 
SCF CAS(8,8), VMC and DMC with a fully optimized 
Jastrow X multi-determinant CAS (8, 8) wave function 
[Jx CAS(8,8)]. In each case, the horizontal line repre- 
sents twice the energy of an isolated atom calculated 
with the same method, and provides a check of the size 
consistency of the method. We stress that the wave 
functions have been optimized by energy minimization 
rather than variance minimization, and in appendix A we 
present an argument suggesting that, as regards size con- 
sistency, energy-optimized wave functions are to be pre- 
ferred over variance-optimized wave functions. For com- 
parison, we also plot a Morse potential [56] , -Emoisc (^) = 
Ecxact(-fi'e) + jJe(l -e-"')' where X = 2p{R-R,)/R, and 
f3 = iUe/{^\/ BeDe), using an estimate of the exact energy 
at equihbrium Eexact{Re) = —75.9265 Hartree [57] and 
accurate spectroscopic constants: equilibrium distance 
Re = 2.3481 [14], well-depth = 6.44 eV [57], first vi- 
brational frequency uje = 1855 cm~* [58] and rotational 
constant Be = 1.81984 cm~^ [59]. For analysis, we report 
in Table I the distribution of the four tt electrons among 
the two carbon atoms A and B in the dissociation limit, 
the remaining eight a electrons being unimportant for 
the study of the dissociation. 

We note that Sorella et al. [60] have also reported re- 
cently QMC calculations of the potential energy curve 
of the C2 molecule, using a pseudopotcntial and Jastrow 
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FIG. 2: Total energy of the all-electron C2 molecule as a function of the interatomic distance R calculated in (plot a) RHF, 
VMC and DMC with a fully optimized Jastrow x single determinant wave function [J x SD], and (plot b) MCSCF CAS(8,8), 
VMC and DMC with a fully optimized Jastrow x multi-determinant CAS(8,8) wave function [J x CAS(8,8)], using the CVBl 
Slater basis form [43]. In each case, the horizontal line represents twice the energy of an isolated atom calculated with the same 
method, and provides a check of the size consistency of the method. For comparison, a Morse potential [56] using accurate 
spectroscopic constants is also shown (see text). 



TABLE I: Distribution of the electrons among the two carbon atoms A and B of the C2 molecule in the dissociation limit. 
For the neutral dissociations, only the distribution of the four tt electrons are considered since the remaining eight a electrons 
are unimportant for the study of the dissociation. The same methods used in Fig. 2 are compared. For the RHF and MCSCF 
wave functions, the percentages of the distributions can be calculated analytically. 





neutral dissociation 


ionic dissociation 




Aiti)+B{tl) 


^(TT) + i3ai) 






RHF 


25 % 


6.25 % 


6.25 % 


62.5 % 


VMC JxSD 


^ 43 % 


^ 13 % 


^ 13 % 


^ 31 % 


DMC JxSD 




^ 50 % 


^ 50 % 


^0% 


MCSCF CAS(8,8) 


33.33 % 


33.33 % 


33.33 % 


% 


VMC JxCAS(8,8) 


^ 33 % 


^ 33 % 


^ 33 % 


^0% 


DMC JxCAS(8,8) 


^ 33 % 


^ 33 % 


fa 33 % 


^0% 



antisymmetrized geminal power wave functions. 

At very large interatomic distances, lack of ergodicity 
in the QMC calculations may be an issue, as electrons 
tend to remain stuck around an atom, and nonequili- 
brated results can be obtained. In VMC calculations it 
is always possible to make large moves of the electrons be- 
tween the two atoms, as done for example in Ref. 60, but 
this is not possible in DMC calculations where dynamics 
of the moves is specified and becomes exact only in the 
small time-step limit. One can nevertheless avoid being 
deceived by using a large population of walkers (thereby 
improving the sampling of the configuration space), look- 
ing at the evolution of the results as the bond is stretched, 
and performing several runs with different starting loca- 
tions of the walkers. 

We first discuss the single-determinant case. RHF is of 
course not size consistent, and leads to a large percent- 
age of incorrect ionic dissociations (62.5%). Our Jastrow 
factor has a multiplicatively separable form (at dissocia- 
tion, it reduces to the product of the Jastrow factors cm- 
ployed for the isolated atoms), so that fulfillment of size 



consistency in VMC calculations is only dependent on 
the dcterminantal part of the wave function. The VMC 
calculation using a single determinant is not size consis- 
tent, but the Jastrow factor reduces the size-consistency 
error and decreases the percentage of ionic dissociations 
to about 31%. Interestingly, the DMC calculation us- 
ing the nodes of a non-size-consistent single-determinant 
trial wave function appears to be size consistent within 
the accuracy of the calculation, ionic dissociations be- 
ing absent. Moreover, examination of the distribution of 
electrons in the DMC calculation shows that the distri- 
bution ^(ti) + ^(Ti) has a vanishing probability in the 
dissociation limit (see Table I). Only the distributions 
A(tT) + B{li) and A{U) + S(tT) remain at dissocia- 
tion. In appendix B, we show that this implies that the 
singlet-spin symmetry of the ground-state is broken with 
an expectation value of the total spin operator over 
the FN wave function of 2. In quantum chemistry, it is 
well known that spin (and/or spatial) symmetry break- 
ing frequently occurs in unrestricted Hartree-Fock or un- 
restricted Kohn-Sham calculations at large interatomic 
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FIG. 3: Error in well depths of the first-row homonuclear 
diatomic molecules using several computational methods (see 
text and Table II). 



distances where electron correlation gets stronger. Our 
results show that spin-symmetry breaking can also occur 
in DMC calculations even using an unbroken-symmetry 
trial wave function, meaning that its nodal surface does 
not impose the spin symmetry. One may wonder how 
the repeated application of a spin-independent Green 
function to an initial trial wave function that is a spin 
eigenstate (neglecting the small spin contamination that 
can be introduced by imposing spin-dependent electron- 
electron cusp conditions in the Jastrow factor [61]) can 
result in a wave function that is not a spin eigenstate. In 
fact, breaking of spin symmetry is possible in DMC calcu- 
lations because the Green function is not applied exactly 
but only by finite sampling. After all, the fermion-sign 
problem can also been seen as resulting from the break- 
ing of the antisymmetry of the trial wave function due to 
finite sampling. 

We now discuss the multi-determinant case. The MC- 
SCF calculation in a full valence CAS, i.e. CAS(8,8) 
for the C2 molecule, is size consistent, the corresponding 
atomic calculation being taken as a MCSCF CAS (4,4) 
calculation as in Refs. 4 and 5. Not surprisingly, the cor- 
responding VMC and DMC calculations are also size con- 
sistent. The DMC energy curve agrees closely with the 
reference Morse potential. For these three calculations, 
at the dissociation hmit, the distributions A(ti) + i?(ti), 
^(TT) + B{ii) and A{ll) + B(tT) are obtained with 
equal weights, which is expected for a proper spin-singlet 
wave function describing two dissociated carbon atoms, 
as noted in Ref. 60. 



VI. RESULTS ON FIRST-ROW ATOMS AND 
HOMONUCLEAR DIATOMIC MOLECULES 

In Table II, we report total energies of the first- 
row atoms and homonuclear diatomic molecules at their 



experimental bond length using several computational 
methods: RHF, MCSCF in a full valence complete ac- 
tive space (FVCAS), VMC with Jastrow x single deter- 
minant [JxSD] and Jastrow x multi-determinant FV- 
CAS [J X FVCAS] wave functions (where the Jastrow, 
CSF, orbital and exponent parameters have been simul- 
taneously optimized), and DMC with the same JxSD 
and JxFVCAS wave functions. For atoms, the ac- 
tive space of FVCAS wave functions consists of the 2s 
and 2p orbitals. For the molecules, it consists of all 
the orbitals coming from the n = 2 atomic shells, i.e. 
2crg2(T„3crgl7r„_2;l7r„^j^l7rg^a,l7rg^2^3cr„ (this is the energy or- 
dering of the Hartree-Fock orbitals for 5 molecules out of 
the 8 molecules). For the atoms Li, N, O, F and Ne, 
and for the dimer Ne2, orbital occupations and symme- 
try constraints imply that the FVCAS wave functions 
contain only a single determinant. Thus, for these sys- 
tems, the FVCAS MCSCF wave functions are identical to 
the RHF wave functions, and the JxFVCAS wave func- 
tions are identical to JxSD wave functions. The well 
depths (dissociation energy -f- zero-point energy) have 
been calculated consistently by using single-determinant 
wave functions for both the molecule and the atom, or 
multi-determinant FVCAS wave functions for both the 
molecule and the atom. The errors of the computed well 
depths are plotted in Fig 3. 

The largest errors of the DMC total energy using 
JxFVCAS wave functions are obtained for the heaviest 
systems and are of the order of 15 mHartree for the atoms 
and 30 mHartree for the molecules. Of course, one can 
always improve the total energy by increasing the num- 
ber of CSFs as done for example by Brown et al. [8], but 
good well depths are already obtained with JxFVCAS 
wave functions due to a compensation of errors between 
the atoms and the molecule. DMC calculations using 
JxFVCAS wave functions give well depths with near 
chemical accuracy (1 kcal/mol « 0.04 eV), the largest 
absolute error being of about 0.1 eV for the N2 molecule. 
In particular, we note that, although the Be2 dimer is 
unbound at the RHF, MCSCF and VMC level, the weak 
bond is well reproduced at the DMC level. Because of 
the extremely weak van der Waals bond of the Ne2 dimer 
we computed the DMC energies of Ne and Ne2 at four 
time steps t = 0.020, 0.015, 0.010, and 0.005 Hartree"! 
and extrapolated to zero time step. The time-step error 
at T = 0.01 for Ne2 was —0.0023 Hartree whereas that 
for Ne was -0.00068 Hartree. 



VII. CONCLUSIONS 

To summarize, we have extended our earlier published 
linear optimization method to allow for nonorthogonal 
orbitals. This then makes it possible to optimize all 
the parameters in the wave function, including the ba- 
sis exponents. Moreover, by noting that the linear opti- 
mization method can be seen as an augmented Hessian 
method, we have shown that it is possible to minimize a 
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TABLE II: Total energies (in Hartree) and well depths (in eV) of first-row atoms and homonuclear diatomic molecules at their 
experimental bond lengths Ro (in Bohr) using several computational methods (see text). The RHF and MCSCF calculations 
have been performed with the Slater CVBl basis set [43], expanding each Slater function into 14 Cartesian Gaussian functions. 
The QMC calculations have been performed with the true Slater basis set rather than its Gaussian expansion. The Jastrow, 
CSF, orbital and exponent parameters of the Jastrow-Slater wave functions have been optimized in VMC, and the resulting 
wave functions have been used in DMC. The DMC energies are for time step t — 0.01 Hartree"^, with the exception of Ne 
and Ne2 for which the energies extrapolated to r = are given. The well depths have been calculated consistently using 
single-determinant (SD) wave functions for both the molecule and the atom, or, full valence complete active space (FVCAS) 
multi-determinant wave functions for both the molecule and the atom. 









Atoms 






Li es) 


Be CS) 


B (2p) C (^P) 


N C'S) 


O CP) F CP) 


Ne CS) 






Numbers of CSFs 


m FVCAS 


wave functions 




1 


2 


2 2 


1 


1 1 


1 



Total energies (Hartree) 



RHF 


-7.43271 


-14.57299 


-24.52903 


-37.68849 


-54.40060 


-74 


.81065 


-99.40937 


-128.54556 


MCSCF FVCAS 




-14.61663 


-24.56372 


-37.70777 












VMC JxSD 


-7.47793(5) 


-14.64972(5) 


-24.62936(5) 


-37.81705(6) 


-54.5628(1) 


-75 


.0352(1) 


-99.7003(1) 


-128.9057(1) 


VMC J X FVCAS 




-14.66668(5) 


-24.64409(5) 


-37.82607(5) 












DMC JxSD 


-7.47805(1) 


-14.65717(1) 


-24.63990(2) 


-37.82966(4) 


-54.57587(4) 


-75 


.05187(7) 


-99.71827(5) 


-128.92346(3) 


DMC J X FVCAS 




-14.66727(1) 


-24.64996(1) 


-37.83620(1) 












Estimated exact 


-7.47806" 


-14.66736" 


-24.65391" 


-37.8450" 


-54.5892" 


-75 


.0673" 


-99.7339" 


-128.9376" 



Li2 (^E+) Be2 (^E+) B2 (^S;) C2 (^E 



Molecules 

t) N2 (^E^ 



O2 (^E,-) F2 (^E+) Ne2 (^E+) 



5.051" 



4.65" 



3.005'' 



Interatomic distances (Bohr) 



2.3481" 



2.075° 



2.283" 



2.668" 



5.84» 



38 



137 



Numbers of CSFs in FVCAS wave functions 
165 107 30 



RHF 

MCSCF FVCAS 
VMC JxSD 
VMC J X FVCAS 
DMC JxSD 
DMC J X FVCAS 
Estimated exact 



14.87127 

14.89758 

14.98255(5) 

14.99229(5) 

14.99167(2) 

14.99456(1) 

14.995(1) 



-29.13148 

-29.22111 

-29.29768(4) 

-29.33180(5) 

-29.31895(5) 

-29.33736(2) 

-29.3380(4) 



-49.08961 

-49.22009 

-49.3457(5) 

-49.3916(2) 

-49.38264(9) 

-49.4067(2) 

-49.415(2) 



Total energies (Hartree) 



-75.40154 

-75.63991 

-75.8088(5) 

-75.8862(2) 

-75.8672(1) 

-75.9106(1) 

-75.9265^ 



-108.98650 

-109.13585 

-109.4520(5) 

-109.4851(3) 

-109.5039(1) 

-109.5206(1) 

-109.5427-'^ 



■149.65881 

■149.76453 

■150.2248(5) 

■150.2436(2) 

■150.2872(2) 

■150.29437(9) 

■150.3274^ 



Well depths (eV) 



■198.76323 

■198.84307 

■199.4209(5) 

■199.4443(3) 

■199.4861(2) 

■199.4970(1) 

■199.5304-'^ 



-257.09105 
-257.80956(2) 
-257.84707(5) 
-257.8753 



RHF 


0.159 


-0.395 


0.858 


0.668 


5.042 


1.021 


-1.510 


-0.00189 


MCSCF FVCAS 


0.875 


-0.331 


2.521 


6.105 


9.106 


3.897 


0.662 




VMC JxSD 


0.726(3) 


-0.048(3) 


2.367(3) 


4.75(1) 


8.88(1) 


4.20(1) 


0.55(1) 


-0.050(5) 


VMC J X FVCAS 


0.991(3) 


-0.042(3) 


2.814(6) 


6.369(6) 


9.78(1) 


4.713(8) 


1.19(1) 




DMC JxSD 


0.9679(8) 


0.125(1) 


2.798(3) 


5.656(3) 


9.583(3) 


4.992(7) 


1.349(6) 


0.004(2) 


DMC J X FVCAS 


1.0465(6) 


0.0767(8) 


2.906(3) 


6.482(3) 


10.037(3) 


5.187(5) 


1.645(4) 




Estimated exact 


1.06(4)" 


0.09(1)" 


2.92(6)" 


6.44(2)^ 


9.908(<3)-^ 


5.241(<3)^ 


1.693(5)-' 


0.00365" 



Ref. 62, " Ref. 59, " Ref. 58, " Ref. 14, " Ref. 63, ' Ref. 57, » Ref. 64. 



linear combination of the energy and the energy variance 
with the hnear optimization method. We have applied 
the method to the calculation of the full ground-state 
potential energy curve of the C2 molecule, and we have 
shown that although a VMC calculation using a spin- 
restricted single-determinant Jastrow-Slater wave func- 
tion is not size consistent, the corresponding DMC calcu- 
lation using the same trial wave function is size consistent 
within statistical uncertainty. The price to pay for this 



size consistency is the breaking of the spin-singlet sym- 
metry at dissociation: the fixed-node DMC wave func- 
tion has an expectation value of 2 for the total spin op- 
erator 5^, although the spin-singlet trial wave function 
is an eigenstate of 5^ with eigenvalue 0. Of course, us- 
ing multi-determinant FVCAS Jastrow-Slater wave func- 
tions, both the VMC and the DMC calculations arc size 
consistent without breaking of spin symmetry. Finally, 
we have performed calculations on the first-row atoms 
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and homonuclear diatomic molecules and showed that 
well depths can be computed with near chemical accu- 
racy using just fully optimized multi-determinant FV- 
CAS Jastrow-Slater wave functions. 



Acknowledgments 

We thank Peter Nightingale, Sandro Sorella, Andreas 
Savin, Frank Petruzielo, Roland Assaraf, Benoit Braida 
and Paola Gori-Giorgi for valuable discussions. We also 
thank Alexander KoUias, Peter Reinhardt and Roland 
Assaraf for providing us with the Gaussian fits of Slater 
basis functions of Ref. 46. This work was supported 
in part by a European Marie Curie Outgoing Inter- 
national Fellowship (039750-QMC-DFT), the National 
Science Foundation (EAR-0530301) and the DOE (DE- 
FG02-07ER46365). Most of the calculations were per- 
formed on the Intel cluster at the Cornell Nanoscale 
Facility (a member of the National Nanotechnology In- 
frastructure Network supported by the National Science 
Foundation) and at the Cornell Theory Center. 



consistency of the method is that it leads to an (approx- 
imate) wave function for the composite system of the 
product form 



\AB)p = V^^V'slvac) 



(A3) 



i.e., the wave function is multiplicatively separable. How- 
ever, this is not a necessary condition, as exemplified by 
perturbation theory (see, e.g., the discussion in Ref. 66). 
Also, in general, due to the nonlocality of the total spin 
operator S'^ , one has in fact to consider a sum of products 
of degenerate spin-multiplet component wave functions of 
the fragments to accommodate non-singlet spin symme- 
try, but it is sufficient to take the simple product form 
of Eq. (A3) for our purpose. If the wave function of the 
system AB has this product form, then it is easy to show 
that the energy variance is also additively separable 



Vab ^ Va + Vb, 



(A4) 



where Vab = {AB\{Hab - Eab?\AB) / {AB\AB) is the 
energy variance of the system AB, and Va = {A\{Ha — 
EAf\A)l{MA) and Vb = {B\{Hb - EBf\B) / {B\B) are 
the energy variances of the fragments. 



APPENDIX A: A REMARK ON SIZE 
CONSISTENCY AND VARIANCE 
MINIMIZATION 



Multiplicative separability of energy-optimized 
linear wave functions 



In this appendix, we briefly review the concept of size 
consistency of an electronic-structure method (see, e.g., 
Refs. 20, 65, 66 for more details), and we give an argu- 
ment for preferring energy-optimized wave functions over 
variance-optimized wave functions as regards size consis- 
tency. 



1. Definition of size consistency 

Consider an electronic system AB made of two nonin- 
teracting fragments A and B (e.g., a diatomic molecule 
at dissociation). This system has a Hamiltonian 



Hab ^Ha + He 



(Al) 



where Ha and Hb arc the Hamiltonians of the frag- 
ments, commuting with each other. If Ea and Eb are the 
(approximate) energies of the fragments given by some 
method, and Eab is the (approximate) energy of the 
composite system given by the same method, then this 
method is said to be size consistent if and only if 



Eab — Ea + Ei 



(A2) 



i.e., the energy is additive. 

In particular, if \A) = iPaWslc) and \B) = iPbW&c) are 
the (approximate) wave functions given by the method 
under consideration where TpA and ipB are second- 
quantized wave operators (commuting or anticommut- 
ing with each other), then a sufficient condition for size 



Before discussing variance-optimized wave functions, it 
is useful to repeat briefly the proof of the multiplicative 
separability of energy-optimized linear wave functions, 
given for instance in Ref. 20. 

Consider that the fragments are described by the fol- 
lowing linearly parametrized (approximate) wave func- 
tions 



CiAllA) 



and 



\B) 



^CjB\jB) 



(A5a) 



(A5b) 



where |i^) = -^i^jvac) and jj^) = ■(/ijslvac) are some 
many-body basis states, and the coefficients Cia and Cjb 
are determined by requiring the stationarity of the energy 
of each fragment 



OEa 
dc 



{ia\Ha~Ea\A) 



■lA 



{A\A) 



dEs ^^il 



b\Hb 



Eb\B) 



dcji 



BB) 



0, 



= 0. 



(A6a) 



(A6b) 



Correspondingly, consider that the composite system is 
described by a linearly parametrized wave function in the 
product basis \iAiB) = V'iAV'jslvac) 



\AB) 



\^A]B) 



(A7) 
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where the coefficients are also consistently determined V^g stationary 
by imposing the stationarity of the energy 



OEab _ r, {iA.]B\HAB - Eab\AB) 
9cy (AB\AB) 



(A8) 



It is then easy to see that the product wave function 
\AB)p = TpAipsl'^^c) makes the corresponding energy 
E^g stationary 



= 2 



{iaIHa-EaIA) UbIB) 



{A\A) 
{jb\Hb- 



{B\B) 
Eb\B) {ia\A) 



(BIB) 



{A\A) 



0, (A9) 



since both terms vanish according to Eqs. (A6). The 
product wave function is thus a possible solution, and, if 
this is the actual solution given by the method, then the 
method is size consistent. 

As an aside, we note that the energy of the product 
wave function will converge exponentially to the sum of 
the constituent energies as a function of the interfrag- 
ment distance because the overlap of the fragment wave 
functions decays exponentially. On the other hand, the 
true wave function has an energy that converges only as 
an inverse power to the sum of the constituent energies 
(Van der Waals interaction). 



dci. 



{tA\{HA-EA)^'VA\A) (jb\B) 



{A\A) 
{jbKHb-Ee 



{B\B) 
Vb\B) {ia\A) 



{B\B) {A\A) 
{iaIHa-EaIA) {jb\Hb-Eb\B) 



{A\A) 



(BIB) 



^0, 



(A12) 



since the last term in Eq. (A12) is the product of the 
energy gradients of the fragments which do not now gen- 
erally vanish. Thus, the wave function minimizing the 
energy variance of the composite system is not a product 
wave function. This suggests that variance minimiza- 
tion does not generally yield additively separable ener- 
gies, Eab 7^ Ea + Eb- Further, since the variance is 
additive for the product wave function, if the method 
minimizes over a space that includes the product wave 
function, then Vab i£ Va + Vb- This happens by hav- 
ing anticorrelated energy fluctuations on the two frag- 
ments. Of course, in the limit of exact wave functions, 
the gradients of the energy and of the energy variance 
simultaneously vanish, and size consistency is ensured. 
Consequently, the magnitude of the violation of size con- 
sistency of variance-optimized linear wave functions is ex- 
pected to become smaller as the wave function becomes 
more accurate. 



3. Lack of multiplicative separability of 
variance-optimized linear wave functions 



APPENDIX B: SPIN-SYMMETRY BREAKING 
IN DMC FOR THE C2 MOLECULE AT 
DISSOCIATION 



In the case of variance-optimized linear wave functions, 
the coefficients CiA and cjb of the fragment wave func- 
tions of Eq. (A5) are determined by requiring the sta- 
tionarity of the energy variance 



dVA 

dCiA 



OVb 



{ia\{Ha - Ea)^ - Va\A) 
{A\A) 



{jB\iHB-EB)^-VB\B) 



dc 



■jB 



{B\B) 



0, (AlOa) 



(AlOb) 



Correspondingly, the coefficients Cy of the composite 
wave function of Eq. (A7) are also determined by impos- 
ing stationarity of the energy variance 



dVAi 

da. 



{iAjB\{HAB - Eab? - Vab\AB) 
{AB\AB) 



(All) 

In contrast to the case of energy-optimized wave func- 
tions, the product wave function \AB)p = -0^-0s|vac) 
now does not make the corresponding energy variance 



In this appendix, we show how the information on the 
real-space location of the electrons in DMC calculations 
of the C2 molecule at dissociation using a spin-restricted 
single-determinant trial wave function reveals that the 
spin-singlet symmetry of the exact ground state is broken 
in the FN wave function. 

For that, we need to determine the expectation value 
of the total spin operator S*^ over the FN wave func- 
tion. As we use in the QMC calculation a real-space 
spin-assigned wave function, we first need to reconsti- 
tute the corresponding total wave function. Reference 61 
shows how to do so using straightforward first quantiza- 
tion. Here, we use the alternative formalism of real-space 
second quantization. 

In this formalism, the total wave function \T) in ab- 
stract Hilbert space corresponding to a TV-electron real- 
space spin-assigned wave function 'I'(ri, r2, • • • ,r7v) = 
(ri |,r2 I,-- - ,rN I \'^) with spin-up followed by 
Ni spin-down electrons is written as 

|r) = J dvidv2 ■■■dYN *(ri, r2, ■■■ ,vn) 

^{v^)^{v2)■■4\{rN)\y^c), (Bi) 
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where i^'^ir) is the fermionic field creation operator at 
point r and spin a. In Eq. (Bl), \I>(ri, r2, • • • ,rjv) is 
taken as antisymmetric under the exchange of two same- 
spin electron space coordinates and normalized to unity, 
i.e. /dri(ir2---drjv|^'(ri,r2,--- ,rAr)p = 1, implying 
that |r) is also normalized to unity, (r|r) = 1. Even 
if the wave function j^f) is not antisymmetric under the 
exchange of two opposite-spin electrons (product of spin- 
up and spin-down determinants) , the total wave function 
|r) is always fully antisymmetric. 

To study the dissociation of the C2 molecule, it is suffi- 
cient to consider only the four tt electrons. The total FN 
wave function IFfn) corresponding to the spin-assigned 
real-space FN wave function \E'FN(ri, r2, ra, r4) is thus 
written as 

|Ffn> = ^ y firi(ir2dr3dr4*FN(ri,r2,r3,r4) 

V^l (n )^| (r2)^| (r3)V^| (r4) I vac) , (B2) 
with the antisymmetry constraints 

^'FN(r2,ri,r3,r4) = -*FN(ri, r2, r3, r4), (B3a) 

*FN(ri,r2,r4,r3) = -*FN(ri, r2, rs, r4). (B3b) 

Examination of the real-space location of the elec- 
trons during DMC calculations using a spin-restricted 
single-determinant trial wave function shows that, 
in the dissociation limit, the mixed distribution 
\I'(ri, r2, r3, r4)^'FN(ri, r2j r3, r4) vanishes whenever two 
electrons of opposite spins are in the neighborhood of 
the same C nucleus. In contrast, we know from VMC 
calculations that the trial wave function \E'(ri, r2, r3, r4) 
does not forbid dissociation with two opposite-spin elec- 
trons around the same nucleus, thus we conclude that it 
is the FN wave function ^'FN(ri, r2, rs, r4) that vanishes 
for this type of dissociation. In other words, it means 
that at dissociation ^'fnCi'i, r2, r4, r3) can be written as 
(assuming that inversion symmetry is preserved) 

*FN(ri,r2,r4,r3) = /^(ri, r2),9i3(r3, r4) 

+,9i3(ri,r2)/A(r3,r4), (B4) 



where /a and qb are antisymmetric two-electron func- 
tions localized around nuclei A and B, respectively. 

We now investigate the spin symmetry of this FN wave 
function. First, using the anticommutation rules of the 
field operators, it is easy to show that the wave func- 
tion IFfn) is an eigenstate of the spin-projection operator 
Sz = (1/2) / (ir[?/;|(r)?/j-|-(r) — '0|(r)i/)|(r)] with eigenvalue 
zero 

5.|Ffn)=0, (B5) 

for any function 'I'FN(ri, r2, r4, r3). The action of the 
total spin operator = S+S- + S^iSz — 1) with S+ = 
J dr V;|(r)^/'|(r) and S- = J dr'0|(r)V;-f (r) on IFfn) gives 

5'^|Ffn) ^ y"dridr2dr3(ir4[2^'FN(ri,r2,r3,i"4) 
-4*FN(ri , r3, r2, r4)]V;j iri)^ (rz)^^! (r3)v3j (r4) |vac) , (B6) 

where the anticommutation rules of the field operators 
and permutations of electron space coordinates have been 
used. Equation (B6) shows that |Ffn) is generally not an 
eigenstate of S^. At dissociation, it is nevertheless pos- 
sible to calculate the expectation value of over |Ffn) 

(FFNlS'^lrFN) = 2 - y firifir2fir3(ir4l'FN(ri,i'2,i'3,r4) 

x*FN(ri,i'3,r2,r4) 
= 2, (B7) 

since the last integral vanishes due to the localized form 
of ^'FN(i"i,i"3,r2,r4) given in Eq. (B4). 

In conclusion, we have shown that the singlet-spin sym- 
metry of the ground-state of the C2 molecule is broken in 
the FN wave function at dissociation using the nodes of 
a spin-restricted single-determinant wave function. The 
expectation value of over the FN wave function is 2, 
which is identical to the value found for the lowest broken 
symmetry solution of the unrestricted Hartree-Fock [67] 
or unrestricted Kohn-Sham equations with usual approx- 
imate density functionals [68]. 
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